Example-Based Machine Translation Based on the Synchronous SSTC Annotation Schema
Abstract
In this paper, we describe an Example-Based Machine Translation (EBMT) system for English-Malay translation. Our approach relies solely on example translations kept in a Bilingual Knowledge Bank (BKB). A flexible annotation schema called Structured String-Tree Correspondence (SSTC) is used to annotate both the source and target sentences of a translation pair. Each SSTC describes a sentence, a representation tree, and the correspondences between substrings in the sentence and subtrees in the representation tree. With both the source and target SSTCs established, a translation example in the BKB can be represented as a pair of synchronous SSTCs. To translate, we first try to build the representation tree for the source sentence (English) using the example-based parsing algorithm presented in [1]. Referring to the resulting source parse tree, we then synthesize the target sentence (Malay) from the target SSTCs pointed to by the synchronous SSTCs, which encode the relationship between source and target SSTCs.
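The structure described in the abstract (a sentence, a representation tree, substring-to-subtree correspondences, and a pairing of source and target SSTCs) can be pictured with a small Python sketch. This is only an illustration under stated assumptions, not the system's actual data model: the class and field names (`Node`, `SSTC`, `SynchronousSSTC`, `span`, `links`), the half-open word intervals, the dependency-style trees, and the English-Malay sentence pair are all made up here for demonstration and are not taken from the BKB.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Assumption: a substring is a half-open word interval [start, end).
Span = Tuple[int, int]


@dataclass
class Node:
    label: str
    span: Span  # substring covered by the subtree rooted at this node
    children: List["Node"] = field(default_factory=list)


@dataclass
class SSTC:
    """A sentence, its representation tree, and substring-subtree links."""
    words: List[str]
    root: Node

    def substring(self, span: Span) -> str:
        return " ".join(self.words[span[0]:span[1]])

    def spans(self) -> List[Span]:
        """Collect the spans of all subtrees in pre-order."""
        out, stack = [], [self.root]
        while stack:
            node = stack.pop()
            out.append(node.span)
            stack.extend(reversed(node.children))
        return out


@dataclass
class SynchronousSSTC:
    """A translation example: two SSTCs plus links between their subtrees."""
    source: SSTC
    target: SSTC
    links: Dict[Span, Span]  # source subtree span -> target subtree span

    def target_for(self, source_span: Span) -> str:
        return self.target.substring(self.links[source_span])


def example_pair() -> SynchronousSSTC:
    # Hypothetical example pair; the verb is the root of each tree.
    en = SSTC(["he", "eats", "rice"],
              Node("eats", (0, 3), [Node("he", (0, 1)), Node("rice", (2, 3))]))
    ms = SSTC(["dia", "makan", "nasi"],
              Node("makan", (0, 3), [Node("dia", (0, 1)), Node("nasi", (2, 3))]))
    links = {(0, 3): (0, 3), (0, 1): (0, 1), (2, 3): (2, 3)}
    return SynchronousSSTC(en, ms, links)
```

Given such a pair, `example_pair().target_for((2, 3))` looks up the target substring linked to the source subtree covering "rice". A real system would additionally need the parsing and synthesis steps the abstract mentions, which this sketch does not attempt.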
Similar resources
Synchronous Structured String-Tree Correspondence (S-SSTC)
In this paper, a flexible annotation schema called Structured String-Tree Correspondence (SSTC) is introduced. We propose a variant of SSTC called synchronous SSTC. Synchronous SSTC can be used to describe the correspondence between different languages. We will also describe how synchronous SSTC provides the flexibility to treat some of the non-standard cases, which are problematic to other syn...
A Synchronization Structure Of SSTC And Its Applications In Machine Translation
In this paper, a flexible annotation schema called Structured String-Tree Correspondence (SSTC) is introduced. In order to describe the correspondence between different languages, we propose a variant of SSTC called synchronous SSTC (S-SSTC). We will also describe how S-SSTC provides the flexibility to treat some of the non-standard cases, which are problematic to other synchronous formalisms. The proposed S-SSTC schema is well sui...
Learning-to-Translate Based on the S-SSTC Annotation Schema
We present the S-SSTC framework for machine translation (MT), introduced in 2002 and developed since as a set of working MT systems (SiSTeC-ebmt). Our approach is example-based, but differs from other EBMT approaches in that it uses alignments of string-tree alignments, and in that supervised learning is an integral part of the approach. Our model directly deals with three main difficulties in ...
Application of Translation Corresponding Tree (TCT) Annotation Schema for Chinese to Portuguese Machine Translation
In Example-Based Machine Translation (EBMT) research, there are three main approaches: surface-based, pattern-based, and structure-based. A structure-based EBMT system, such as the SSTC approach [1], has the problem that it relies on two syntax parsers to analyze the translation examples, but robust syntax parsers are not always available. On the other hand, Chinese and Portuguese belong ...
The Construction of Bilingual Knowledge Bank based on the Synchronous SSTC Annotation Schema
In this paper, we present an approach to constructing a huge Bilingual Knowledge Bank (BKB) from a given bilingual corpus based on the idea of synchronous Structured String-Tree Correspondence (SSTC). The SSTC is a general structure that can associate an arbitrary tree structure with a string in a language, as desired by the annotator, to be the interpretation structure of the string, and ...